1. INTRODUCTION

Parkinson’s disease (PD) is a neurodegenerative disorder that causes progressive deterioration of
motor abilities as a result of damage to the brain cells responsible for producing dopamine [1].
Common symptoms include tremor, difficulty moving, behavioral issues, depression, dementia,
handwriting alterations, muscle rigidity, and impaired posture and balance. Taken together, the main
symptoms are also known as Parkinsonism or Parkinson’s syndrome [1]. Alterations in a patient’s voice
are a common symptom that can be identified by analyzing the patient’s speech data. It has been
observed that the patient’s voice is affected gradually as the disease progresses and that patients
may start stuttering [2].

PD affects both male and female patients and tends to develop after the age of 60. However, there
are cases of PD in patients before the age of 50 [3]. Because the (early) diagnosis of Parkinson’s is
challenging, there is motivation to develop a decision support system (DSS) that helps medical staff
diagnose the disease. Such a system could function as a second opinion, as the use of machine
learning (ML) can reduce the likelihood of errors [4]. Researchers have used several modalities for
PD diagnosis, such as single photon emission computed tomography (SPECT), magnetic resonance imaging
(MRI), handwritten images, and speech changes known as dysphonia; this work uses speech changes to
diagnose PD [5]. Building a DSS with ML typically involves preprocessing, attribute selection,
classification, and validation steps. ML helps in analyzing disease patterns in medical data sets, as
well as in making decisions in a shorter time [6].

Preprocessing covers procedures such as data normalization, attribute selection, and class
balancing. Attribute selection reduces computational cost and can improve classification accuracy. In
this work, attributes are first selected via three attribute selection methods: correlation,
information gain, and variance threshold. Each reduced attribute subset is then used to train and test
the classifiers in order to identify the best combination of attribute selection method and
classifier. Second, the Parkinson’s speech dataset was found to be imbalanced, as 147 of its 195
samples come from individuals with Parkinson’s. Shuffling was therefore applied to address the
imbalance. Finally, a performance analysis of three classifiers (naive Bayes (NB), decision trees
(DT), and support vector machine (SVM)) is conducted on the full and reduced attribute subsets.
Combining the information gain method with the DT classifier gives more favorable results than the
other combinations.

Gupta et al. [7] used two ML classifiers, artificial neural networks (ANN) and random forest (RF),
to predict PD. The dataset used in their experiment was obtained from the University of
California-Irvine (UCI) repository. It contains 754 attributes without missing values, and the class
labels (0) and (1) indicate whether or not the disease is present. Principal component analysis (PCA)
was applied to select the optimal attributes for classification. The experimental results indicate
that combining PCA with ANN leads to better results than combining it with the RF classifier. Senturk
[8] used ML algorithms for diagnosing PD. The best attributes were chosen via the recursive feature
elimination (RFE) method, and ANN, SVM, and regression trees were implemented for classification.
Combining RFE and SVM achieved an accuracy of 93.84%.

Tuncer et al. [9] used vowels to diagnose Parkinson’s syndrome. The attributes were selected using
a relief-based method, and eight classification algorithms were evaluated. The k-nearest neighbor
(KNN) classifier achieved an accuracy of 92.46%, outperforming the other classifiers. Sharma et al.
[10] used a variety of ML algorithms for diagnosing Parkinson’s syndrome on the PD speech datasets
provided by the UCI ML repository. The authors implemented a modified grey wolf optimization
algorithm to select the best attributes and adopted three classifiers: RF, KNN, and DT. The
experimental results showed that the best accuracy achieved by the classifiers on the speech dataset
was 93.87%. The remainder of this article is organized as follows: section 2 describes the materials
and methods, section 3 presents the proposed method, section 4 reports the results and discussion,
and section 5 states the conclusion.


2. MATERIALS AND METHOD 
2.1. Parkinson’s-speech dataset 

Studies indicate a consistent pattern of vocal deterioration in most cases of Parkinson’s.
Therefore, this work addresses the distinction between patients who suffer from Parkinson’s and
healthy subjects via the analysis of speech signals [11]. The benchmark Parkinson’s-speech dataset
used in the present paper is an open-access dataset that is freely available from the UCI repository.
It contains 195 instances with 23 numeric attributes derived from voice recordings made for study
purposes.

The data indicates the status by means of binary values: (1) states that the subject suffers from
PD, and (0) states that the subject is healthy. The proposed method is evaluated on this UCI dataset
[11]. Table 1 summarizes the dataset, Table 2 gives the class statistics, and Figure 1 shows a bar
chart of the class distribution, while the 23 attributes are described in detail in Table 3.


Table 1. Description of the selected dataset
Name of dataset         Parkinson speech
Number of instances     195
Number of attributes    23
Class variable          Healthy and Parkinson


Table 2. The statistics of classes in the dataset
Class            Instances    Distribution (%)
Parkinson (1)    147          75.38
Healthy (0)      48           24.62
Total            195          100
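
The percentages in Table 2 follow directly from the class counts. The short check below is only an illustrative sketch; the variable names and the imbalance ratio are introduced here and do not appear in the original study.

```python
# Class counts as reported in Table 2.
counts = {"Parkinson (1)": 147, "Healthy (0)": 48}
total = sum(counts.values())

# Share of each class in percent, rounded to two decimals.
distribution = {label: round(100 * n / total, 2) for label, n in counts.items()}

# Imbalance ratio (illustrative): majority-class size divided by minority-class size.
imbalance_ratio = max(counts.values()) / min(counts.values())

print(total, distribution, round(imbalance_ratio, 2))
```

Roughly three Parkinson instances exist for every healthy one, which is why accuracy alone can be misleading and why recall, precision, and F-measure are also reported later.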


Bulletin of Electr Eng & Inf, Vol. 12, No. 6, December 2023: 3365-3373 




ISSN: 2302-9285


[Bar chart of the number of instances per class; legend: Parkinson, Healthy]

Figure 1. Class distribution in the dataset


Table 3. Attribute description
Attribute name               Description of abbreviation
#1_(MDVP-Fo (Hz))            Multidimensional voice program (MDVP) average vocal fundamental frequency
#2_(MDVP-Fhi (Hz))           MDVP maximum vocal fundamental frequency
#3_(MDVP-Flo (Hz))           MDVP minimum vocal fundamental frequency
#4_(MDVP-Jitter (%))         MDVP jitter in percent
#5_(MDVP-Jitter (Abs))       MDVP absolute jitter in microseconds
#6_(MDVP-RAP)                MDVP relative amplitude perturbation
#7_(MDVP-PPQ)                MDVP period perturbation quotient
#8_(Jitter-DDP)              Average absolute difference of differences between cycles, divided by the average period
#9_(MDVP-Shimmer)            MDVP local shimmer
#10_(MDVP-Shimmer (dB))      MDVP local shimmer in decibels
#11_(MDVP-APQ)               MDVP amplitude perturbation quotient
#12_(Shimmer-APQ3)           3-point amplitude perturbation quotient
#13_(Shimmer-APQ5)           5-point amplitude perturbation quotient
#14_(Shimmer-DDA)            Average absolute difference between consecutive differences between the amplitudes of consecutive periods
#15_(NHR)                    Noise-to-harmonics ratio
#16_(HNR)                    Harmonics-to-noise ratio
#17_(DFA)                    Detrended fluctuation analysis
#18_(Spread1)                Nonlinear measure of fundamental frequency
#19_(Spread2)                Nonlinear measure of fundamental frequency
#20_(D2)                     Correlation dimension
#21_(PPE)                    Pitch period entropy
#22_(RPDE)                   Recurrence period density entropy
#23_(Status)                 (0) Healthy; (1) Parkinson


2.2. Attribute ranking method 

Attribute ranking methods select attributes based on specific performance metrics, with no regard
to the prediction algorithm. They are therefore applied before the prediction models are built [12].
Three ranking methods have been implemented to evaluate and rank each attribute in the
Parkinson’s-speech dataset [13]. The attribute ranking methods adopted in this system are outlined in
the following sections.


2.2.1. Correlation method 

This method measures, for each attribute individually, the correlation between the attribute and
the target class [14]. The attribute weight ranges between -1 and +1. An attribute is considered very
weak if its weight is close to zero, meaning that it is not related to the target class, and very
strong if its weight is close to +1, meaning that it is highly related to the target class [14]; a
weight close to -1 likewise indicates a strong, but inverse, relationship. The correlation between
each attribute and the target class is calculated in (1):


$$\mathrm{cor}(X, Y) = \frac{\sum_{i}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i}(X_i - \bar{X})^2 \, \sum_{i}(Y_i - \bar{Y})^2}} \tag{1}$$


where $X$ represents the attribute, $Y$ represents the target class, $\bar{Y}$ is the average of the
target class, and $\bar{X}$ is the average of the attribute.
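
As an illustration of (1), the following minimal sketch computes the correlation weight of each attribute and ranks the attributes by absolute weight. The function names, the toy values, and the choice to assign a weight of 0 to constant attributes are assumptions made here for illustration; they are not taken from the original implementation.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences, as in (1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    denom = math.sqrt(var_x * var_y)
    # Assumption: a constant sequence (zero variance) gets weight 0.
    return cov / denom if denom > 0 else 0.0

def rank_by_correlation(columns, target):
    """Return (name, weight) pairs sorted by absolute correlation, strongest first."""
    weights = {name: pearson(values, target) for name, values in columns.items()}
    return sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)

# Toy data (hypothetical values, not from the Parkinson's-speech dataset).
target = [0, 0, 1, 1, 1, 0]
columns = {
    "attr_a": [0.1, 0.2, 0.9, 0.8, 0.7, 0.3],  # rises with the class label
    "attr_b": [5.0, 5.0, 5.0, 5.0, 5.0, 5.0],  # constant, carries no signal
    "attr_c": [0.9, 0.8, 0.2, 0.1, 0.3, 0.7],  # falls with the class label
}
ranking = rank_by_correlation(columns, target)
for name, weight in ranking:
    print(f"{name}: {weight:+.3f}")
```

Ranking by absolute value keeps strongly negative attributes such as `attr_c` near the top, since they are as informative as strongly positive ones.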


2.2.2. Information gain method 

Information gain is an essential and commonly used method for selecting attributes. The
significance of each attribute is determined relative to the target class. If the information gain of
an attribute exceeds a particular threshold, the attribute is considered important. The method is
therefore often adopted to reduce dimensionality and increase the efficiency of classification. The
information gain of each attribute with respect to the target class is obtained using (2) [15]:


$$IG(S, t) = E(S) - E(S \mid t) \tag{2}$$


where $E(S)$ is the entropy of a random variable $S$ (target class) and $E(S \mid t)$ is the
conditional entropy of $S$ given the value of the attribute $t$.
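
A minimal sketch of (2) for a discrete attribute is given below. The speech attributes are continuous, so some discretization is needed before (2) can be applied; the source does not state which one was used, and the mean split in `split_at_mean` is purely an assumption for illustration, as are all function names.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy E(S), in bits, of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(attribute, labels):
    """IG(S, t) = E(S) - E(S | t) for a discrete attribute t."""
    n = len(labels)
    groups = {}
    for value, label in zip(attribute, labels):
        groups.setdefault(value, []).append(label)
    # Conditional entropy: entropy of each group weighted by its relative size.
    conditional = sum((len(g) / n) * entropy(g) for g in groups.values())
    return entropy(labels) - conditional

def split_at_mean(values):
    """Assumed discretization: True where a value lies above the attribute mean."""
    mean = sum(values) / len(values)
    return [v > mean for v in values]

# Toy data (hypothetical values).
labels = [0, 0, 1, 1]
informative = split_at_mean([0.1, 0.2, 0.8, 0.9])    # separates the classes
uninformative = split_at_mean([0.1, 0.9, 0.2, 0.8])  # unrelated to the classes

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

An attribute that separates the two classes perfectly recovers the full entropy of the target, while an unrelated attribute yields a gain of zero.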


2.2.3. Variance threshold method 

The variance threshold method is an attribute selection method that removes low-variance
attributes, which are of little use in modeling, from the dataset. In the extreme case, a constant
attribute takes the same value in all observations of the dataset. Such attributes provide no
information that allows ML models to predict the target efficiently [16]. The variance of each
attribute is calculated in (3):


$$\mathrm{Variance} = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2 \tag{3}$$

where $X_i$ are the values of an attribute, $\bar{X}$ is their mean, and $n$ is the number of instances.
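
The following sketch applies (3) and discards attributes whose variance does not exceed a threshold. The function names, toy values, and threshold are hypothetical; the threshold used in the study is not stated in this section.

```python
def variance(values):
    """Population variance as in (3): mean squared deviation from the mean."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n

def variance_threshold(columns, threshold):
    """Keep only the attributes whose variance is strictly above the threshold."""
    return {name: vals for name, vals in columns.items() if variance(vals) > threshold}

# Toy data (hypothetical values).
columns = {
    "constant": [3.0, 3.0, 3.0, 3.0],     # variance 0
    "nearly_flat": [1.0, 1.0, 1.0, 1.2],  # very small variance
    "spread_out": [1.0, 2.0, 3.0, 4.0],   # variance 1.25
}
kept = variance_threshold(columns, threshold=0.1)
print(sorted(kept))
```

Because variance depends on the unit of measurement, attributes are usually brought to a common scale before a single threshold is applied to all of them.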


2.2.4. Decision trees technique 

A DT is a prediction model that maps observations about an item to conclusions about the item’s
target value. Its structure includes root, internal, and leaf nodes. It can be viewed as a flow chart
with a tree-like structure, in which internal nodes denote test conditions on attributes, branches
represent the outcomes of those tests, and the leaf (terminal) nodes are assigned class labels [17].
The topmost node is the root of the DT. Overall, this type of structure follows a “divide and
conquer” approach, whereby each path from the root to a leaf forms a decision rule. The benefits of
DT include fast classification, strong learning ability, and a relatively simple structure [17].


2.2.5. Naive bayes technique 

NB is a simple classification method based on identifying the probabilistic relations among
classes and attributes [18]. It relies on Bayes’ theorem to compute the probability of the target
class from the values of the predictors or attributes. It is often favored over other probabilistic
classifiers, as it computes the most likely output for the provided input [19], [20].
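
For reference, the standard form of the naive Bayes decision rule can be written as below. The symbols $c$ (a class), $x_j$ (the value of the $j$-th attribute), and $m$ (the number of attributes) are notation introduced here for illustration and do not appear in the original text.

```latex
P(c \mid x_1, \dots, x_m) \;\propto\; P(c) \prod_{j=1}^{m} P(x_j \mid c),
\qquad
\hat{c} = \arg\max_{c} \; P(c) \prod_{j=1}^{m} P(x_j \mid c)
```

The product form reflects the "naive" assumption that the attributes are conditionally independent given the class.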


2.2.6. Support vector machine 

SVM is a classification algorithm that can be used with both linear and non-linear data. It works
by transforming the original training data into a higher dimension via a non-linear mapping. The
model then searches for the optimal linear separating hyperplane in that space [21]. With a suitable
non-linear mapping to a sufficiently high dimension, a hyperplane can separate the data of the two
classes. To classify data, SVM maximizes the margin between the two classes and minimizes
classification errors. SVM can also be applied to regression [22], [23].


2.3. Performance measures 

The present study adopts four common performance metrics for evaluating the classification
algorithms. The confusion matrix shown in Figure 2 records the correct and incorrect classification
results and is used to measure the quality of the classifier [24], [25].


                               True class
                               Positive    Negative
Predicted class    Positive    TP          FP
                   Negative    FN          TN

Figure 2. Confusion matrix


where TP is true-positive, FP is false-positive, FN is false-negative, and TN is true-negative.

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{5}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{6}$$

$$F\text{-}\mathrm{Measure} = \frac{2 \times (\mathrm{Recall} \times \mathrm{Precision})}{\mathrm{Recall} + \mathrm{Precision}} \tag{7}$$
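
The four measures in (4) to (7) can be computed directly from the confusion-matrix counts, as in the sketch below. The function name, the zero-division handling, and the example counts are assumptions for illustration; the counts are not results from the study.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, recall, precision, and F-measure from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    # Assumption: a measure with an empty denominator is reported as 0.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = recall + precision
    f_measure = 2 * recall * precision / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_measure": f_measure,
    }

# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.2f}%")
```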


3. THE PROPOSED METHOD OF THE STUDY 

The proposed methodology involves three stages. In the first stage, ranking methods are applied
for attribute selection. In the second stage, classification models are applied for the prediction
task. Finally, the classification models are evaluated based on various measures. The block diagram
of the proposed method is shown in Figure 3.

Figure 3. The architecture of the proposed method


Stage 1, attribute selection: several attribute selection methods are applied to the
Parkinson’s-speech dataset to reduce the attribute space, so that a subset of the most important
attributes is selected from the original ones. The attribute selection methods used are correlation,
information gain, and variance threshold. These methods are applied before the classification model
and select attributes according to their own performance measures, with no regard to the
classification algorithm. Their key role is to identify the most important attributes, that is, those
that directly affect the target class (Parkinson or healthy). Each method evaluates the attributes
and assigns a rank value to each of them, and all weak attributes are removed using a predefined
threshold. As for the class imbalance problem, shuffling has been used to handle this issue.

Stage 2, prediction: this stage is the most important step in the proposed method. Three different
classification models are used to validate how accurate the attribute selection is, ensuring that
the selected attributes are indeed those most likely to influence the target class (Parkinson or
healthy).

Stage 3, evaluation of the prediction model: in this stage, the accuracy, recall, precision, and
F-measure performance measures are used to measure the efficiency of the classification models.
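
The thresholding step of Stage 1 can be sketched as follows. The function name, the generic attribute names, the weights, and the threshold are all hypothetical and serve only to illustrate the mechanism; they are not the values used in the study.

```python
def select_attributes(weights, threshold):
    """Keep attributes whose absolute weight reaches the threshold, strongest first."""
    kept = [(name, w) for name, w in weights.items() if abs(w) >= threshold]
    kept.sort(key=lambda kv: abs(kv[1]), reverse=True)
    return [name for name, _ in kept]

# Hypothetical rank values produced by one of the ranking methods.
weights = {"attr_1": 0.41, "attr_2": -0.58, "attr_3": 0.07, "attr_4": 0.12}
selected = select_attributes(weights, threshold=0.3)
print(selected)
```

Only the selected attribute columns would then be passed on to the classifiers in Stage 2.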


4. RESULTS AND DISCUSSION

The proposed methodology treats the problem as a classification task that predicts the class label
(Parkinson or healthy) in the Parkinson’s-speech dataset. The hold-out validation method (80% for
training and 20% for testing) is used for validating the results. First, the weight of each attribute
is calculated using the correlation method, and the top 12 attributes are retained according to a
predefined threshold value. In the information gain method, the weight of each attribute is computed,
and any attribute with a weight below the predefined threshold value is discarded. This method yields
the most compact subset, as only 5 attributes are selected. In the variance threshold method, the
variance of each attribute is computed, and any attribute that does not reach a predefined threshold
is removed from the dataset, leaving 12 attributes. Table 4 lists the attributes selected by each
attribute selection method.


Table 4. Attributes selected by attribute selection methods
Attribute selection method    No. of attributes selected    Attribute names
Correlation                   12                            Shimmer: DDA, Shimmer: APQ3, MDVP: Shimmer(dB), Shimmer: APQ5, HNR, MDVP: APQ, MDVP: Shimmer, MDVP: Flo(Hz), MDVP: Fo(Hz), spread2, PPE, spread1
Information gain              5                             PPE, spread1, MDVP: Fo(Hz), spread2, MDVP: APQ
Variance threshold            12                            MDVP: Fo(Hz), MDVP: Fhi(Hz), MDVP: Flo(Hz), MDVP: Shimmer(dB), NHR, HNR, RPDE, DFA, spread1, spread2, D2, PPE
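
The 80%/20% hold-out validation used in this section can be sketched as a shuffle of the instance indices followed by a split. The function name, the fixed seed, and the absence of stratification are assumptions made here; the source does not specify how the split was drawn.

```python
import random

def hold_out_split(n_samples, test_fraction=0.2, seed=0):
    """Shuffle sample indices and split them into training and test index lists."""
    indices = list(range(n_samples))
    # Assumption: a seeded shuffle, so that the split is reproducible.
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]

# 195 instances, as in the Parkinson's-speech dataset.
train_idx, test_idx = hold_out_split(195)
print(len(train_idx), len(test_idx))
```

With 195 instances this gives 156 training and 39 test instances.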


After the attributes were selected via the attribute selection methods, the efficiency of the three
ML classifiers was evaluated on the different attribute subsets. Every classifier achieved a higher
accuracy with each attribute selection method than with the full attribute set. Table 5 reports the
performance of the NB, DT, and SVM classifiers on all attributes and on the reduced attribute subsets.


Table 5. Performance of classifiers with attribute selection methods
Classifier    Measure          All attributes    Correlation    Information gain    Variance threshold
NB            Accuracy (%)     69.23             82.05          89.74               84.61
NB            Precision (%)    75                81             87                  82
NB            Recall (%)       78                88             93                  89
NB            F-measure (%)    69                81             88                  83
DT            Accuracy (%)     84.61             92.30          97.43               94.87
DT            Precision (%)    82                92             98                  97
DT            Recall (%)       84                89             95                  91
DT            F-measure (%)    83                90             97                  93
SVM           Accuracy (%)     87.17             89.74          94.87               92.30
SVM           Precision (%)    92                94             97                  95
SVM           Recall (%)       79                82             91                  86
SVM           F-measure (%)    83                86             93                  90
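
When reading Table 5, it helps to keep the size of the test set in mind. Assuming the test set holds 39 instances (20% of 195, which is an inference from the stated split rather than a figure given in the text), each reported accuracy corresponds to a whole number of correctly classified instances. The sketch below checks this; the names and the truncation to two decimals are assumptions made here.

```python
TEST_SIZE = 39  # assumed: 20% of the 195 instances

def truncated_accuracy(correct, total=TEST_SIZE):
    """Accuracy in percent, truncated (not rounded) to two decimals."""
    return (correct * 10000 // total) / 100

# Distinct accuracy values reported in Table 5.
reported = [69.23, 82.05, 89.74, 84.61, 92.30, 97.43, 94.87, 87.17]

# Map every reachable accuracy value to its number of correct predictions.
reachable = {truncated_accuracy(k): k for k in range(TEST_SIZE + 1)}
matches = {acc: reachable.get(acc) for acc in reported}

for acc, correct in matches.items():
    print(f"{acc:.2f}% -> {correct} of {TEST_SIZE} correct")
```

Under this assumption, neighboring accuracy values such as 94.87% and 97.43% differ by a single test instance, so small differences between classifiers should be interpreted with care.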


Figure 4 compares the classification accuracy of the three classifiers across the attribute
selection methods; an improvement is observed for all reduced attribute subsets. The information gain
method performs better than the other selection methods, with accuracies of 89.74%, 97.43%, and
94.87% for NB, DT, and SVM, respectively.

Table 6 compares the proposed methodology with the methodologies of previous studies. Figure 5
illustrates the accuracy improvement of the proposed methodology over the previous methodologies
implemented by other authors. All code implementing the proposed hybrid methods was executed in
Python (version 3.7) with Jupyter Notebook under a 64-bit Windows OS, on a machine with an Intel Core
i7 processor, 6 GB of memory, and an NVIDIA GeForce GTX graphics card with 2 GB of memory.

Figure 4. Comparative analysis of attribute selection methods with classifiers


Table 6. Performance comparison with previous studies
Reference          Attribute selection methods                               Classification models             Accuracy (%)
[17]               Cuttle-fish algorithm                                     KNN, DT                           92.19
[10]               Grey-wolf algorithm                                       RF, KNN, DT                       93.87
[8]                RFE and attribute significant algorithm                   SVM, classification trees, ANN    93.84
[18]               Genetic algorithm, extra tree, and mutual information     NB, RF, KNN                       95.58
Proposed method    Correlation, information gain, and variance threshold     SVM, DT, NB                       97.43

[Bar chart of accuracy (%) for the proposed method and the previous studies listed in Table 6]

Figure 5. Comparison of accuracy with previous studies


5. CONCLUSION 

This study presented a hybrid approach to improve the accuracy of classification (class label:
Parkinson or healthy) on the Parkinson’s-speech dataset. No single attribute selection method or
classifier is universally best for medical data sets; to find the best results, researchers have to
try different methods and identify the best combination. The main aim of attribute selection methods
is to select the best subset of attributes by eliminating those that carry no predictive information.
The results indicate that attribute selection is beneficial, as it reduces time and increases
simplicity and accuracy. The hybrid method introduced in this work yielded better results than the
compared approaches, achieving an accuracy of 97.43%. The proposed method is not intended to replace
healthcare experts, but rather to function as a second opinion in diagnosing PD. Future research will
study the efficiency of the proposed methodology on other speech and voice data sets.